CS294-1 Homework 2: Linear Regression
نویسندگان
چکیده
The goal of this homework assignment was to use linear regression to predict users’ ratings of books sold on Amazon.com. To construct the feature matrix, we tokenized the written review texts in several different ways: With and without stemming, as well as using either single words as tokens or bigrams. The dependent variable is a star rating given by reviewers, ranging from 1 star (very unsatisfied) to 5 stars (very satisfied). In the original dataset, the 3-star ratings were removed, so the others could be grouped to create a binary classification problem: 1and 2-star ratings are considered negative reviews, while 4and 5-star ratings are considered positive ratings.
منابع مشابه
Cs294 a Toolkit for Algorithms 1.1 Problems and Open Questions
1. Explore applications of the recent ”Quantum Algorithm for Linear Systems of Equations” by Harrow, Hassidim and Lloyd [5]. This algorithm can solve a linear system of the form Ax = b, in time logarithmic in the dimensions of A, an exponential improvement over classical algorithms. However, the quantum algorithm cannot output the vector solution x explicitly, but only give an approximation to ...
متن کاملRevising on the run or studying on the sofa: prospective associations between physical activity, sedentary behaviour, and exam results in British adolescents
BACKGROUND We investigated prospective associations between physical activity/sedentary behaviour (PA/SED) and General Certificate of Secondary Education (GCSE) results in British adolescents. METHODS Exposures were objective PA/SED and self-reported sedentary behaviours (screen (TV, Internet, Computer Games)/non-screen (homework, reading)) measured in 845 adolescents (14·5y ± 0·5y; 43·6 % ma...
متن کاملECE 661 Computer Vision : HW 5 Dae
In the last homework, the DLT algorithm was used to find the homography which minimized an algebraic cost function. In this homework, we wish to find the homography H which minimizes the geometric cost function given by Equation 4.6 in the textbook (pg. 94) (1) where d() is Euclidean distance. Equation (1) can be rewritten as a nonlinear least squares problem with the form C(p) = X − F (p) 2 (2...
متن کاملAMS 274 Generalized Linear Models Homework 4 Nov 29 2016
1. The table below reports from a developmental toxicity study involving ordinal categorical outcomes. This study administered diethylene glycol dimethyl ether (an industrial solvent used in the manufacture of protective coatings) to pregnant mice. Each mouse ws exposed to one of five concentration levels for ten days early in the pregnancy (with concentration 0 corresponding to controls). Two ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012